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Abstract. Inversions of rotational splittings have shown that 
there exists at the base of the solar convection zone a region 
called the tachocline in which high radial gradients of the rota- 
tion rate occur. The usual linear regularization methods tend to 
smooth out any high gradients in the solution, and may not be 
appropriate for the study of this zone. In this paper we use, in 
the helioseismic context of rotation inversions, regularization 
methods that have been developed for edge-preserving regu- 
larization in computed imaging. It is shown from Monte-Carlo 
simulations that this approach can lead directly to results sim- 
ilar to those reached by linear inversions which however re- 
quired some assumptions on the shape of the transition in order 
to be deconvolved. The application of this method to LOWL 
data leads to a very thin tachocline. From the discussions on 
the parameters entering the inversion and the Monte-Carlo sim- 
ulations, our conclusion is that the tachocline width is very 
likely below O.O5i?0 which lowers our previous estimate of 
0.05 ± O.O3i?0 obtained from the same dataset (Corbard et al. 
1998). 
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1. Introduction 

Helioseismic inverse problems consist in using the properties 
(namely frequencies or frequency splittings) of the oscillation 
pattern observed at the surface of the Sun in order to infer the 
internal variation of solar physical properties like the rotation 
rate, the sound speed or the density (see e.g. Gough & Thomp- 



(Phillips 1962; Twomey 1963; Tikhonov 1963) assumes a 



son (1991) for a review). These problems can be expressed in 



terms of integral equations whic h represent ill-posed problems 
in the sense of Hadamard ( 1923 ). 

The traditional approach used to override this difficulty 
consists in regularizing the problem by adding some a pri- 
ori information on the solution (e.g. Kirsch 1996; Craig & 
Brown 1986). The well known Tikhonov regularization method 
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global smoothness of the solution by minimizing the norm of 
its derivative at a given order. Nevertheless, inversions for the 



solar rotation have shown (see e.g. Thompson et al. ( 1996 1 and 
Schou et al. ( 19981 ) f° r m e latest results) that high gradients ex- 
ist in the solar rotation profile near the surface and at the base of 
the con vectio n zone in the so-called solar tachocline (Spiegel 
&Zahn|l992|). 



Therefore, enforcing global smoothness a priori may not be 
appropriate for the study of these zones, which are of particular 
interest for the study of solar dynamics. As a matter of fact, the 
tachocline represents a thin zone where the differential rotation 
of the convection zone becomes rigid in the radiative interior. 
It is thought to be the place where the solar dynamo originates 
and its precise structure is an important test for angular mo- 
mentum transport theories. More precisely, the thickness of the 
tachocline can be related to the horizontal component of the 
turbulent viscosity and may be used as an important observa- 
tional constraint on the dynamics properties of the turbulence 



(Spiegel & Zahn 1992; Gough & Sekii 1998; Elliot 1997). 



Several works have already been perfo rmed to inf er the 
fine structure of the tachocline (Kosovichev 
Charbonneau et al. 



1997 



1998; Antia et al. 



1998 



Corbard et 

al. I1998J) using both forward analysis and inverse techniques. 



1996,1998; Basu 



For the inverse approach, it may be interesting to change the 
global constraint which tends to smooth out every high gradi- 
ents in the solution and to find a way to preserve these gradi- 
ents in the inversion process. A first attempt in that direction 
has been carried out (Corbard et al. 1998| ) by using a nonlin- 
ear regularization term based on the Piecewise Polynomials- 
Truncated Singular Value Decomposition (PP-TSVD) method 
(Hansen & Mosegaard 1996| ) which uses a 1-norm. Neverthe- 
less it has been shown that this method can produce solutions 
with sharp discontinuities even if the width of the tachocline is 
relatively large. This leads to very large error bars on the in- 
ferred width of high gradient zones and complicates the inter- 
pretation of the results obtained from real observations. More 
elaborate nonlinear techniques have been developed for edge- 
preserving regularization in computed imaging. In this paper, 
we investigate their use in the helioseismic context. 



2 



T. Corbard et al.: Non linear regularization for helioseismic inversions 



Section |] briefly recalls how the solar internal rotation can 
be related to the frequency splittings determined from helio- 
seismic measurements, and introduces the corresponding dis- 
cretized inverse problem. The non linear approach of regular- 
ization in inverse techniques is introduced in Sect. |[ and the 
computational aspects are discussed. For the particular case of 
solar rotation inversion and the determination of the tachocline 
width, Sects. [| and |5] present how the regularizing parameters 
have been chosen and give the results of Monte-Carlo simu- 
lations for the estimation of the uncertainty on the tachocline 
width. Finally, the results obtained with LOWL data are dis- 
cussed in Sect. [| and we conclude in Sect. 0. 

2. The astrophysical problem and its discretization 

The internal rotation f2(r, 8) of the Sun expressed as a function 
of the solar radius r and colatitude 6 can be related (Hansen 
et al. 1977), through a 2D integral equation, to the observed 
frequency splittings Af„Zm where each mode of solar acoustic 
oscillations is characterized by the degree I, the radial order n 
and the azimuthal order m (—1 <m<T). 

In this work, we focus on the application of non linear in- 
version to the description of the tachocline profile in the equa- 
torial plane. The so-called 'tachocline parameters' are obtained 
by fitting, between O.4i? and 0.8i?©, the solution f2(r) of the 
inversion by an error function (erf) of the form: 



fi/«(r) = fio- 



l+erf 



Q.5w 



+a(r - 0.7).(1) 



This defines five tachocline parameters: flo, f2j., f c , w and a. 
The co efficient a has been introduced, following Antia et al. 
( 1998 ), in order to take into account the linear behaviour some- 
times found for the rotation rate in the convection zone just 
above the transition in the equatorial plane (see Sect. ||). 

Thus, we consider the ID problem of inferring the so- 
lar equatorial rotation profile Vl eq = f2(r, 90°) from the sum 
of odd-indexed a-coefficients defined by the expansio n of t he 
splittings on orthogonal polynomials (e.g. Schou et al. 1994) 



i=i,3,. 



K n i(r) VL eq (r) dr, 



(2) 



where K n i(r) are the so-called rotational kernels, which have 
been calculated for each mode from a solar model taken from 



Morel et al. (1997). In the following, they are assumed to be 



known exactly. 

This approximation of the 2D integral equation is valid only 



for high degree modes (e.g. Corbard 1997) but the influence of 



the low degree modes on the determination of the position and 
width of the tachocline and the rotation rate of the upper layers 
is thought to be small. 

We discretize Eq. (Q) by using a polynomial expansion 
method, which leads to the matrix equation: 



W = Ml 



(3) 



where we have defined the vector W = (Wi/ai)i = i,N, of N 
truncated sum of a-coefficients Wi — Y^jLi a j weighted by 
the standard deviation <7j for each mode i = (n, I). The num- 
ber rii of a-coefficients is fixed by the observations for each 
mode. The standard deviations have been computed straight- 
forwardly from the uncertainties quoted on each a-coefficient. 
In this work, we assume no error correlation between the dif- 
ferent modes. 

We seek a solution f2(r) defined as a piecewise linear func- 
tion of the radius: 



fi(r) 



fi = (w p ) p= i j 



(4) 



where ipp( r )^ P = 1) N p are piecewise straight lines (N p = 100 
in this work) between fixed break points distributed according 
to the density of turning points of the modes (cf. Corbard et al. 



1997). The discretization matrix R is then defined by: 



R — (Rip) 4=1 



Rip = — I Ki(r)ip p (r)dr 



(5) 



An inverse method should lead to a solution that is able to 
produce a good fit to the data. We define the goodness-of-fit by 
the x 2 value obtained for any solution Q(r): 



Wi - J Re Ki(r)n(r)dr 



a, 



which can be written in the discretized form: 



x 2 m = \\mi-w\\i 



(6) 



(7) 



3. Regularization: the non linear approach 

3.1. generalized regularization term and Euler equations 

Unfortunately, the inverse integral problem is an ill-posed prob- 
lem and the minimization of only the \ 2 value generally leads 
to oscillatory solutions that are not 'physically acceptable' in 
the sense that they do not correspond to our a priori knowl- 
edge on the shape of the solution. So, we have to use regular- 
ization techniques i.e. to introduce a priori information in the 
minimization process. A large class of these techniques, can be 
expressed in the general form of the minimization of a criterion 
J over the unknown solution fi(r): 



J(n(r))=X 2 mr)) + \ 2 



9 



d q n(r) 



dr<i 



dr, 



(8) 



where A is the so-called trade-off parameter, chosen so that it 
establishes a balance between the goodness-of-fit to the data 
and the constraint introduced on the solution. The parameter 
5 allows to fix the threshold on the gradient modulus of the 
solution under which it is smoothed, and above which it is pre- 
served (cf. Sect. Q). The order q of the derivative is usually 
taken equal to one or two. The two choices can lead to similar 



T. Corbard et al.: Non linear regularization for helioseismic inversions 



3 



results with the appropriate choice of the regularizing param- 
eter A in the domains where the solution is well constrained 
by the data. However these two choices correspond to two dif- 
ferent a priori constraints on the solution. As the rotation is 
known to be quasi-rigid in the radiative interior (at least down 
to O.4i? ), we have chosen in this work to use the first deriva- 
tive. We note however that the method described below can 
easily be generalized for any choice of the derivative order. 

Two choices for the (^-function lead to well known regular- 
ization strategies: 

- ip(t) = t 2 leads to the traditional Tikhonov approach with 
first derivative whereas 

- ip(t) = t is known as the Total Variation (TV) regulariza- 
tion method (e.g. Rudin et al. 119921 Acar & Vogel 119941). 



This approach uses the absolute value rather than the square 
modulus of the solution gradient. It has been shown that 
the solution is searched in a space composed of bounded 
variation functions which admit discontinuity points. This 
regularization method is therefore able to recover piecewise 
smooth soluti ons with steep g radients (see e.g. Dobson & 
Santosa |1994| ; Vogel & Oman |l996Hl997| ). The PP-TSVD 
method of Hansen & Mosegaard ( 1996|), alr eady used in 
helioseismic context in Corbard et al. ( 1998 ), can be seen 
as a 'truncated version' (in the sense that the regularizing 
parameter is a discrete truncation parameter) of the TV reg- 
ularization in the same way as the MTSVD method intro- 
duced by Shibahashi & Sekii (1988) (see also: Hansen et 
al. 1992; Corbard et al. 1998) is a 'truncated version' of the 



Tikhonov regularization. 

For a general (^-function, one can write the criterion Eq. ( 
in a discretized form: 



J(fl) = X 2 (tt) + A 2 J 2 (ft), 



(9) 



where ^(^2) represents the discretized regularization term de- 
fined by: 



^1 



dQ(r) 



dr 



dr = J 2 (0) 



N p -1 

P =i 



I in I 



•(10) 



In this equation (c p ) p= i t N p -i represent the weights used for 
the integration rule, and L is a discrete approximation of the 
first derivative operator. |£Oj is the absolute value of the p th 
component of the vector Ltt. The expression of c p and L are 
given in Appendix for the simple case of the polynomial ex- 
pansion Eq. (Q) 

The minimization of the criterion J(fi) over each compo- 
nent lu p of fi leads to the following Euler equations (discretized 
form): 

VJ(fi) = <=> {R T R + X 2 L T B(n)L)n = RW (11) 

where A = h and B is a diagonal matrix with elements that 
depend on the gradient of the solution at each grid point: 

tp (t) 



B = diag(b p ) with b p 



2t 



For ip(t) — i 2 , B is independent of fi and these normal 
equations reduce to a linear system which corresponds to the 
usual Tikhonov regularization with first derivative. On the other 
hand, for a general (^-function, this leads to a nonlinear prob- 
lem which requires an appropriate iterative method of solution. 

Now, with this general expression for the normal equations, 
the question for our particular problem becomes: what proper- 
ties must the ^-functions satisfy to ensure the preservation of 
high gradients in the solution? The next section shows how the 
theoretical works developed in the field of computed imaging 
gives an answer to this question, and leads to an algorithm for 
solving the non linear Euler equation that is easy to implement. 



3.2. Properties of the weighting function 



<£ W 
it 

From the Euler equation (Eqs. ( JTT| ) and (|l2|)) we can see that the 
function tp (t)/2t acts as a weighting function in the smoothing 
process: at each grid point the gradient of the solution is used as 
an argument of this function in order to set locally the magni- 
tude of regularization. This suggests an iterative process where 
the gradient of the solution at a given step is used for the com- 
putation of the regularization term at the next step. We show in 
the following that we can derive some basic properties of the 
weighting function so that high gradients can be preserved, and 
such that an iterative algorithm for solving the Euler equation 
is possible. First let us look at the behaviour of the weighting 
function at the limits of low and high gradients: 

- For low gradients, we want to keep a Tikhonov regulariza- 
tion. From Eq. ( pT| ) this is the case if tp (t) jit is a non-null 
constant function. 

- For high gradients, we want to remove smoothing. This 
happens when tp (t) /2t is close to zero. 

- Another property that sounds reasonable is to choose a de- 
creasing function of the gradient between these two limits. 
Furthermore, in order to avoid numerical instabilities we 
will choose only strictly decreasing weighting functions. 

Therefore the choice of the tp function must be made tak- 
ing into account the following three properties needed on the 
weighting function tp' (t) /2t (Charbonnier et al. |1994[ |1997[): 



1. Tikhonov smoothing for low gradients: 

< lira = M < oo 
t->o 2t 

2. no smoothing for high gradients: 



lim 



2t 







strictly decreasing on [0, 
to avoid instabilities. 



-oo 



(13) 



(14) 



(15) 



Within these con dition s, the (^-function ma y be c hosen ei- 
ther convex (Green 1990; Charbonnier et al. 1994) or non- 



(12) convex (Perona & Malik 199C; Geman & McClure 1985 
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Hebert & Le ahy |1989[ ) (see also Charbonnier et al. ( |1997| ) and 
Teboul et al. (1998) for examples in both cases). A non-convex 
function may be better suited for the search of high gradi- 
ents. Nevertheless this choice leads to some numerical diffi- 
culties and instabilities related to the existence of local min- 
ima. This may induce a high sensitivity of the inverse process 
to the choice of the regularization parameters (Blanc-Feraud et 
al. 1995). On the other hand, the choice of a convex function 
may avoid these numerical problems and is mo re suitable for 
relatively smooth transition (Blanc-Feraud 1998 ). 

We note that in the case of the TV regularization (or equiv- 
alently the PP-TSVD method) the function ip(t) = t does not 
satisfy the first property (Eq. [l3]). This may explain the diffi- 
culties encountered in using this method (Corbard et al. 1998 ). 
For smooth transitions, the dispersion of the results for differ- 
ent realizations of the noise became large, indicating some in- 
stabilities in the inversion process. In light of the above for- 
malism, this may be related to the non differentiability of the 
corresponding weighting function ip'(t)/2t at t = 0, which can 
lead to numerical instabilities. 



3.3. The iterative algorithm: ARTUR 

Under the three conditio ns (|l3| ) c[l4|) (P~5l) , it has been shown by 
Charbonnier et al. (1994 1997) that the non linear criterion can 



be solved by using an iterative scheme named ARTUR (Alge- 
braic Reconstruction Technique Using Regularization) that is 
easy to implement: at each step k we calculate the regulariza- 
tion term using the derivative of the previous estimate fl k ~ 1 
and we simply compute the new estimate Q k by solving the 
linear system: 



R 1 R + X^L 1 B(n k ~ 1 )L ) fi fc = R W 



(16) 



When we use a convex (^-function, the convergence of this so- 
called half quadratic algorithm (minimiza tion o f a quadratic 
criterion at each step (Geman & Reynolds 1992 )) to the min- 
imum of the criter ion gi ven by Eq. (||) has been established 
(Charbonnier et al. 1997| ). This is also an adaptive regulariza- 
tion method which uses the information on the derivative of the 
solution obtained at each step in order to improve the regular- 
ization at the next step. This requires an initial guess for the 
solution, but we will show in the next section that a constant 
solution can always be used as the starting guess. Figures [j] 
and ^ show examples of ARTUR steps in the case of a discon- 
tinuous rotation rate and for two different levels of noise. At 
each step of the ARTUR algorithm the linear system Eq. ( p"ri| ) 
has been solved using an iterative conjugate gradient method 
with Jacobi preconditioning (see e.g. Golub & Van-Loan |1989| ; 
Barrett et al. 1994) using tt k ~ 1 as starting point. This leads to 



a very fast algorithm where the number of conjugate gradient 
iterations needed to solve the linear system decreases at each 
ARTUR step. The algorithm is stopped when the 2-norm of 
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Fig. 1. Solutions obtained by inverting splittings computed 
from a discontinuous one dimension rotation profile (solid line) 
for the same modeset as in the LOWL data and including 
some 'realistic' noise (see text). The standard Tikhonov solu- 
tion is given for two different automatic choices of the regu- 
larizing parameter. The successive steps of ARTUR algorithm 
are shown in the upper left insert, whereas the final step is 
shown on the main plot. The choice of the regularizing func- 
tion and parameters for ARTUR algorithm are those discussed 
in Sect. [|. The solutions are plotted without error bars for clar- 
ity. 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

ARTUR Steps 




Initial rotation rate 
Tikhonov + L— curve 
Tikhonov + GCV 
ARTUR (final step) 



0.7 

r/R„ 



Fig. 2. The same as Fig. |T] but computed for a lower level of 
noise (standard deviations divided by vTO). Comparison be- 
tween Figs. 0and|| show the smoothing effect of the data noise 
level for the three methods. 

the relative difference between two solutions at two successive 
steps is below 10~ 6 i.e.: 

mfc _ n fc-in. 



\n k \ 



< 10 



-6 



(17) 



4. The choice of y-function and regularizing parameters 
for rotation inversion 
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4.1. The Lp-f unction 

In the particular case of the determination of the solar 
tachocline profile, the uncertainty on the width of t he tra nsi- 
tion zone is still large (see Table 2 of Corbard et al. ( 1998 ) for 
a summary of some previous works). Therefore, according to 
the previous discussion, we have chosen to consider a convex 
regularizing ^-func t ion, s pecifically the one defined in Char- 
bonnier et al. (|1994 11997|): 



tp(t) = + 1 



(18) 



This function is close to the absolute value function used in 
TV regularization and PP-TSVD method, but, unlike the abso- 
lute value, has a quadratic behaviour near that satisfies the 
requirement of Eq. (13) on the contrary of the absolute value. 
The weighting function, shown as dashed lines in Figs. || and 
||, is given by: 



(t)/2t= (1 + t 2 



4.2. The regularizing parameters 



(19) 



The choice of the parameters (A, 8) is an important point, as 
in every regularization methods. For example, u sing a Gener- 
alized Cross Validation (GCV) strategy (Wahba |l977| ) for the 
regularizing parameter of a Tikhonov model leads to an os- 
cillating solution while using the L-curve strategy produces a 
smoother solution. This choice influences evidently the estima- 
tion of the width of the tachocline. It is intuitively not surprising 
as the model represents a priori information on the solution (so 
the solution depends on this information). 

Parameter estimation for regularizing problem is a deep 
question and represen ts an active research ar ea of its own (eg. 
Lakshmanan & Derin 1989 ; Thompson et al. 1991 ; Galatsanos 
& Katsaggelos 1991 ). Some results on (A, 6) parameters have 
been obtained for the p ropose d model in the fi eld of image pro- 
cessing (Chan & Gray 1996; Jalobeanu et al. |1998 ; Zerubia et 
al. 1998 1. However, it is still an open problem, and studies and 
results depend sensitively on the considered inversion problem. 

In the remainder of this paper, a choice of the parameters 
(A, 8) is proposed, based on heuristic considerations (the fol- 
lowing two sub-sections) and simulation results (Sect. ^). The 
application to real data is then discussed in Sect. |6| 

4.2. 1 . The choice of A: using automatic strategies 

If the initial guess is a constant function, then according to 
property 1 (Eq. |l3]) and Eq. (|l9|), M = 1 and the solution at 
the first ARTUR step corresponds to a Tikhonov solution with 
A as regularizing parameter. It has been shown in Corbard et al. 



( 1998) that the GCV strategy for the choice of the regularizing 



parameter in Tikhonov inversions leads systematically to better 
results concerning the evaluation of the tachocline parameters, 



as compared to the L-curve strategy (Hansen 1992). In fact, the 




100 150 200 

dQ/dr (nHz/R e ) 



GCV strategy leads systematically to less smoothing than the 
L-curve approach (XLcurve — 100 * Xgcv m that work) and 



Fig. 3. The solid line shows the first derivative of a first step 
solution (cf. Fig. [[]) in the ARTUR algorithm as a function of 
the fractional solar radius. For each value of the gradient, the 
dashed line gives the weight that will be given locally to the 
regularizing term at the following step of the algorithm. The 
weighting function is given by Eq. (|l9|) with t = \dVL/dr\/5 
and 5 — lOOi?0/nHz. The dotted line indicates the gradient 
for which the local regularization will be reduced by 50% at 
the second step, as compared to the first step. Here, this occur 
for radius between O.65i? and 0.75i? Q . 



therefore is more suited to the study of a region with high gra- 
dients. Nevertheless, because of the low global regularization, 
this choice may lead to spurious oscillations below and above 
the tachocline (see Figs. [[],§)■ The ARTUR algorithm will tend 
to enhance the high gradients found at the first step. Therefore 
it is important to start with a solution smooth enough to avoid 
spurious oscillations with high gradients. The GCV choice for 
A may therefore not be well adapted for ARTUR initial step, 
and some experiments have shown that, on the other hand, the 
L-curve choice leads often to a solution which is too smooth 
and does not allow the buildup of the high gradients expected 
during the iterations. Nevertheless, the optimal choice of this 
parameter strongly depends on the level of noise included in 
the data, and therefore it is important to define an automatic 
choice of this parameter so that we use the same strategy for 
different realizations of the noise in Monte-Carlo simulations 
(cf. Sect. ||). The results of these simulations are shown in Fig. |] 
with both A = A£ cur „ e /10 and A = Xgcv- The second choice 
has been retained for the application to real data (cf. Sect. Irjh. 



4.2.2. The choice of 5: using a priori knowledge on the 
searched gradients 

The parameter 6 has been introduced in order to adapt the shape 
of the weighting function to the gradient that we seek to detect 
in a particular application. We have chosen for simplicity to 
keep this parameter constant during the iterations. As we start 
the iterative process with a constant guess rotation, the first step 
is independent of 5 and thus we can use the solution obtained 
after this initial step in order to adapt the parameter S to the 
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gradients found in the first step solution. As we have shown 
that the first step of ARTUR algorithm correspond to a classic 
Tikhonov inversion, this choice of 8 can be viewed as a way to 
use our a priori knowledge (as given by Tikhonov inversion) of 
the searched gradients. 

It is generally admitted that the width of the tachocline does 
not exceed 0.1 solar radius, which is also the resolution typi- 
cally reached near the tachocline localization (~ O.69i?0) with 
a Tikhonov method applied to the current datasets. Furthermore 
we have a good estimate of the difference between the rotation 
rate above and below the transition (~ 30nHz in Corbard et al. 
(1998)). Therefore we can estimate a level of 300nHz/i? Q for 



the maximum gradient obtained at the first iteration of ARTUR 
process. 

Figure || shows (solid line) the first derivative of solu- 
tion obtained at the first step by inverting artificial splittings 
which have been computed for a discontinuous rotation law 
(cf. Fig. [j]) with a step of 30nHz and by adding some Gaus- 
sian noise with a standard deviation taken from the form al erro r 
given in LOWL data for each mode (cf. Corbard et al. 1998|) 



The weighting function Eq. ( |l9| ) is shown in dashed line for 
8 = lOOi?© / nHz. At the second step we want to preserve only 
high gradients i.e. to regularize less in these zones where high 
gradients have already been found at the first step. For exam- 
ple, according to Fig. ||, the choice of 8 = IOORq /nHz leads 
to regularize 50% less at the second step in that zones where 
the gradient of the first step solution is above ~ 175 nHz/i? Q . 
For the particular realization of the noise introduced in artifi- 
cial data this choice of 6 = IOORq /nHz looks reasonable, in 
the sense that it will tend to decrease the regularization espe- 
cially in the transition zone. A smaller value would enhance the 
secondary peaks that are induced by the data noise. 

The maximum gradient obtained, at the first step, with the 
artificial dataset (350nHz/i? Q in Fig. |^) corresponds approx- 
imately to our previous estimate of 300nHz/i? Q expected for 
real data. For the Monte-Carlo simulations done in order to es- 
timate the errors (cf. Sect. ||) the parameter 8 has been fixed 
to 8 — \00RQ/nHz. Nevertheless, the shape of the solution 
derivative after the first step is a function of the dataset through 
the intrinsic resolution of the modeset and the level of noise. 
Therefore, with real data the choice of 8 will always be made 
by looking at the derivative profile after the first step. We will 
see however in Sect. ^ that other indicators can help in the 
choice of 8, and that according to these indicators the choice 
8 = IOORq /nHz seems to be a good compromise for LOWL 
data also. 



5. On the error estimation on tachocline parameters using 
nonlinear methods 

For nonlinear methods, we cannot compute straightforwardly 
the formal errors at each point of the solution as we can do for 
linear process. The ARTUR algorithm solves a linear system at 
each step, but the final result depends nonlinearly on the data 
since the coefficients of the matrix to be inverted at each step 




0.02 0.04 0.06 0.08 

initial width (fractional solar radius) 

Fig. 4. Monte-carlo simulation for the estimation of the error 
on the width inferred with the ARTUR algorithm. Triangles are 
for A = Xlcutvc/W and circles for A = Xgcv- In both cases 
8 = IOORq /nHz. The rotation profile was taken as an erf 
function with different 'initial widths'. The 'inferred widths' 
are the mean value for 500 noise realizations of the results ob- 
tained by fitting directly the solutions by an erf function (cf. 
Eq. [j] where a is assumed to be zero). Error bars represent a 
68.3% confidence interval on the width. For comparison, the 
squares show the result obtained from Tikhonov method with 
GCV choice of the parameter after a 'local deconvolution' us- 
ing the averaging kernels computed at the center of the transi- 
tion (see Corbard et al. 1998). 



are functions of the data through the derivative of the previous 
estimate that is used as an argument of the weighting function. 

We focus here only on having an estimate of the uncertainty 
on the width w and the location f c of the tachocline. We cannot 
obtain directly error bars on the solution, and therefore the fit 
by an erf function does not give an estimate of the error on 
the inferred parameters. Instead, a first estimate of the uncer- 
tainty on the tachocline width inferred here has been computed 
using a Monte-Carlo method applied on given rotation profiles 
simulated by erf functions with widths lying between 0.01i? Q 
and 0.08i? Q (in steps of 0.01i? Q ) and located at r c = 0.7i? Q . 
The LOWL set of modes splittings corresponding to these ro- 
tation profiles have been computed with addition of a Gaussian 
noise with a standard deviation taken for each mode from the 
formal error given in LOWL data. Since the ARTUR algorithm 
is non linear, we can not define averaging kernels but Fig. |] 
shows that the final step of the algorithm with A = AL CMr „ e /10 
and 8 = IOORq /nHz leads directly to results similar to those 
reached by Tikhonov inversion after a 'local deconvolution' us- 
ing the averaging kernels computed at the center of the tran- 
sition (see Corbard et al. 1998). The la error interval on the 
width is found to be around ±0.02i? Q for widths in the range 
0.01 — O.O8i?0. This uncertainty is probably related to the in- 
trinsic resolution of the modeset and the level of noise con- 
tained in LOWL data. 
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A large bias is found with deconvolved Tikhonov method 
for w < 0.02i?©. This bias is less important for the AR- 
TUR method with A = A£ CTO , t , e /10 and vanishes if we set 
A = ^gcv- Nevertheless, in this latter case error bars are big- 
ger (around ±0.025i? Q in the range 0.02 - O.O8i? of initial 
widths) and the inferred widths tend to be underestimated for 
initials widths between O.O3i?0 and 0.05i?Q. This is due to 
the fact that, in mean for the simulated data, the GCV criterion 
leads to a too strongly oscillating solution that is not a good 
starting point for ARTUR algorithm, as it tends to produce very 
sharp transitions in a large number of realizations for all initial 
widths below O.O5i?0. 

The Monte-Carlo simulations have been performed for 500 
realizations of the input errors for each initial width and the 
mean value over all the realizations (8 x 500) of the inferred 
center f c is 0.701 ± 0.004i? Q . The center of the erf func- 
tion seems therefore to be very well recovered by the inver- 
sion. Nevertheless, we must keep in mind that the center of 
the tachocline is defined as the center of the erf function that 
gives the best fit of the solution. This may not give an appropri- 
ate view of the tachocline profile if, for example, the solution 
is found to lack such a symmetry in the lower and upper parts 
of the tachocline when inverting real data. 

Another important point that Fig. |] demonstrates is the abil- 
ity of the ARTUR algorithm to recover not only rotation with 
a discontinuity (as shown in the examples of Fig. [l] and ||), but 
also rotation with a relatively smooth transition. This property 
does not characterize the PP-TSVD method, and this was the 
reason for the difficulties encountered by Corbard et al. (1998) 
in interpreting the results obtained with this first non linear ap- 
proach. 

Even if both methods give similar results in the mean for 
some choices of the regularizing parameters, it sometimes hap- 
pen that the two solutions differ strongly for a particular real- 
ization of the noise. Furthermore, the two approaches are very 
different in their underlying principles and therefore it is very 
interesting to compare the two results with observed data. 



6. Application to observed data 

6.1. LOWLdata 

The LOWL instrument is a Doppler imager based on a Potas- 
sium Magneto-Optical Filter that has been operating on Mauna 
Loa, Hawaii since 1994 (see Tomczyk et al. ( 1995| ) for a de- 
tailed description). The dataset includes the frequency split- 
tings of 1102 modes (n, I) with degrees up to I = 99 and fre- 
quencies lower than v — 3500 /iHz deduced from a two year 
period of observation ( 2/26/94 - 2/25/96 ). For each mode, in- 
dividual splittings are given by, at best, m = 5 a-coefficients 
of the expansion on orthogonal polynomials defined by Schou 




'ms0 



0.8 0.9 1 



r/R e 



etal. (1994) 



Fig. 5. Solar equatorial rotation rate estimated from LOWL 
data. The vertical error bars given at each grid points are the 
la confidence intervals estimated for the T-GCV solution. The 
dashed line shows ARTUR solution obtained with A = Xqcv 
and 5 = IOORq /nHz. The fits by an erf function (Eq. ([!])) 
of these two solutions are shown respectively by the solid and 
dotted lines. The fits have been computed only between 0.4i? Q 
and O.8i?0 and the five parameters deduced from each fit are 
shown in Fig. [| 



6.2. The choice of inversion parameters and ARTUR solution 
for LOWL data 

Since the first step of the ARTUR algorithm is standard 
Tikhonov inversion, we first present the results obtained from 
Tikhonov method and GCV choice of the regularization pa- 
rameter (called T-GCV method hereafter). The equatorial pro- 
file obtained from T-GCV method on LOWL data is shown in 
Fig. pL The fit of the solution with an erf function of the form 
Eq. (Q) leads to a width of w ~ 0.09i? Q (cf. Fig. |d). By tak- 
ing into account the width of the averaging kernel computed at 
0.7i?Q, the corrected inferred width obtained from this 'local 
deconvolution' is w ~ 0.06i?©. 

Contrary to the example shown in Fig. [j] for simulated data, 
the GCV choice of the regularization parameter does not lead, 
with LOWL data, to an oscillating solution. This may indicate 
that this particular realization of the noise introduced in simu- 
lated data is rather different from the noise contained in LOWL 
data. The formal errors quoted on each a-coefficient are per- 
haps overestimated. Furthermore our model assumes that the 
errors are uncorrelated, which is probably not strictly the case. 
Therefore, with real data, the T-GCV solution may be a good 
starting point for the ARTUR algorithm and we choose to take 
A = Agcv so that the first ARTUR iteration leads to the T- 
GCV solution. 

Figure || shows the absolute value of the first derivative of 
the T-GCV solution. High gradients are found not only in the 
tachocline, but also near the surface (above O.O9i?0). How- 
ever LOWL data include relatively few high degree modes 
(I < 99 in these data), so we will fo cus on ly on the tachocline 
in this work. As discussed in Sect. 4.2.2 , Fig. ^ can help for 
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Table 1. Comparison between our previous work on inferring the equatorial tachocline profile from LOWL data and the results 
obtained with non linear regularization applied to the same data. The previous estimates of the parameters and their errors have 
been deduced from a comparison of three inversion methods, including the T-GCV method. See Sect. |6.3| for a more detailed 
discussion on the tachocline width using the ARTUR algorithm 





fio(nHz) 


Qi(nHz) 


a(nHz/7? ) 


rc/R Q 


w/Rq 


Corbard et al. (1998) 


431.0 ±3.5 


459.0 ±1.5 





0.695 ± 0.005 


0.05 ±0.03 


ARTUR 


430.5 


452.0 


70.0 


0.691 


0.01 




Fig. 6. Same as Fig. | but for LOWL data. The weighting func 
tion is shown for two choices of the parameter S. 



the choice of the parameter 5. The weighting function is shown 
for two values of 6, The choice 5 — IOORq/uHz lowers the 
regularization by more than a factor two at the second step 
in the tachocline and in the upper layers, whereas the choice 
5 = 2OOi?0 /nHz will never decrease the regularization by 
more than 50% after the first step. From this figure, we can 
guess that a choice of S < IOORq/ nHz will tend to en- 
hance spurious oscillations due to the noise, whereas a choice 
S > 200RQ/nHz will lead to a result very similar to the T- 
GCV solution because only few points will be affected by the 
local change of regularization during the ARTUR steps. Be- 
tween these two limits however, it is not clear which is the best 
choice for S. 

However, some indicators may help in the choice of 5: 

- First, an important test for any global inversion is its capa- 
bility of providing a good fit to the data. Figure ^|c shows, 
as a function of S, the normalized \ 2 value for the modes 
which have their turning points between 0.6i?o and O.8i?0. 
As expected, this value is always lower for all ARTUR 
solutions than for the T-GCV solution, because we tend 
to regularize less. The gain in the \ 2 value is however 
small because the regularization is decreased only locally 
in a small region. The x 2 value reaches a minimum for 
S = 50i?o / nHz but, as expected from Fig [| for such low 
S, the ARTUR solution becomes very oscillating and a dis- 
continuity (w ~ 0) is found near the tachocline but in fact 



Fig. 7a-c. Variation of various \ 2 indicators with 5. a Differ- 
ence between the goodness of the fits of ARTUR solutions by 
an erf function with or without a linear part (cf. Eq. (pT|)). 
b Difference between T-GCV and ARTUR solutions in and out 
of the zones where high gradients are expected (cf. Eq. (j2C|)). 
c The normalized \ 2 value of ARTUR inversions for modes 
which have their turning points between 0.6i?© and 0.8-R©. 
The horizontal line shows the value (1.07) obtained from T- 
GCV solution. The star symbol indicate the \ 2 value reached 
for the choice 5 = 100R Q /nHz retained for the ARTUR al- 
gorithm. 



the solution is found to be piecewise constant with many 
discontinuities. 

There is another indicator that we can use showing that this 
value of S — hOR^/nHz is not appropriate, despite the 
good x 2 value reached with this parameter. One of the ob- 
jectives of the ARTUR method is to keep the same reg- 
ularization as in Tikhonov method in zones without high 
gradients. Therefore one expects that the amount of change 
(compared to T-GCV solution) will be more important in 
the tachocline (and possibly near the surface) than in other 
zones. We denote by Cl s A (r) the solution at the final step of 
ARTUR algorithm for a given 6, and by fi T (r) the T-GCV 
solution; we then define the two quantities: 



1 



Ptl\ in 



(n 5 A (r p )-n T (r p )y 



(20) 
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where I m = {p / 0.6 < jfe < 0.8 or r p > 0.9i? Q }, I out = 
[1, N p ] — Ii n and iVj n , N out are the sizes of the two sets. 
Figure [7]b is a plot of these two quantities. It shows that for 
8 lower than 60i?© /nHz the ARTUR algorithm tends to 
alter the solution even in regions where no high gradients 
are expected, but that for values above 8 ~ IOORq/uHz 
there is no longer any changes in these regions, as expected. 

From these two criteria it seems that a choice of 8 = 
IOORq/uHz is a good compromise between minimizing the 
X 2 value for modes which have their turning points near the 
tachocline, and operating changes essentially in zones where 
high gradients are expected. This was precisely the objective of 
the non linear regularization approach. The ARTUR solution is 
shown in Fig. || for this choice 8. The inferred tachocline pa- 
rameters are summarize in Tab. [I] and compared to the results 
obtained by Corbard et al. (1998) for the same dataset. The es- 
timates of the center of the tachocline and the rotation rate in 
the radiative interior obtained from ARTUR algorithm are in 
good agreement with our previous work. The inferred value of 
f2i and a are such that f2i + 0.1a ~ 459nHz, which corre- 
spond to the rotation rate inferred at O.8i?0 and to the value 
of the f2i parameter obtained by the fit of the T-GCV solu- 
tion. The tachocline width inferred from the ARTUR inversion 
{w = O.Oli?©) is smaller than our previous estimate but still 
compatible if we take into account an error of 0.02i? Q as in- 
dicated by the Monte-Carlo simulations. The next section will 
give a more detailed discussion for the interpretation of this re- 
sult by showing how the estimates of the width vary with the <5 
parameter, and during the ARTUR iterations. 

6.3. On the inferred tachocline width 

Figure |^ shows how the parameters, inferred from a fit of the 
final step of the ARTUR algorithm are sensitive to the choice 
of 8. It shows that the relation Cl\ + 0.1a ~ A59nHz is still 
valid for other choices of 8, and that f2 an d varv on ly 
little with 8 whereas the inferred width increases rapidly for 
8 > IOORq/uHz. It reaches O.O55i? for <5 = 2Q0R G /nHz 
which is the limit for the choice of 8 above which we think that 
the ARTUR algorithm is no longer effective. For large values of 
8, the number of step in the ARTUR process becomes very low, 
ARTUR solutions tend to T-GCV solution, and thus, tachocline 
parameters inferred from ARTUR solutions tend toward those 
inferred from T-GCV solution (cf. Fig. |8|). 

We have tried to fit the solutions with or without the linear 
part after the transition (i.e. by searching for the best a coeffi- 
cient or by setting a = in Eq. (jl])). The goodness of the fit is 
defined by: 

X%M= Nf l_ b E (n(r P )-n flt (r p )) 2 (21) 

pel f it 



fit 



where Ifu is the set of indices given by If a = {p / 0.4 < 
-g^ < 0.8}, and Nfu is the size of If#. We denote by X/it a _ 
the goodness of a fit obtained with only four parameters (a = 
in Eq. dl])). In this case the denominator of Eq. (|2l|) becomes 
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Fig. 8a-d. Variation with <5 of the tachocline parameters as de- 
duced by fitting ARTUR solutions by an erf function (cf. 
Eq. ([!])). The dashed lines show the results obtained by search- 
ing only four parameters {0,q, Qi, f c , w) after setting a = 0, 
whereas the solid lines show the results obtained when a is a 
free parameter of the fit. In this latter case, the inferred value of 
a is shown by the dotted line on panel b and the star symbols 
show the results obtained for 8 — IOORq/uHz. The horizon- 
tal lines indicate the values (independent of 8) of the tachocline 
parameters obtained by fitting the T-GCV solution. These hor- 
izontal lines represent limits for high 8 of the parameters in- 
ferred from ARTUR solutions. On panel d the horizontal lines 
indicate the width as inferred directly by the fit of T-GCV so- 
lution. The width corrected by a 'local deconvolution' using 
averaging kernels, is w ~ 0.06i?©. 



Nfu — 4. The two fits are almost equivalent when applied to the 
T-GCV solution (a = 5nHz when it is searched, cf. Fig. |]b), 
but the fit that allows a linear part has been found to be better 
suited for describing ARTUR solutions for all the choices of 
8 (cf. Fig. |7|a). We note however that if one chooses to fit the 
solutions with a = (i.e. to describe the tachocline by a simple 
erf function), then the inferred width would be systematically 
increased by a value up to 0.02i? Q (cf. Fig. ^|d). 

As the increase of the x 2 value between 8 = 100i?o /nHz 
and 8 = 200R & /nHz is low (cf. Fig ^|a), and because of our 
previous estimate of ±0.02i? Q for the uncertainty on the width, 
we cannot exclude a width larger than O.Oli?©. Furthermore 
Fig. ^| shows that the minimum value of the x 2 is reached af- 
ter only six ARTUR iterations, whereas the inferred width still 
decreases from 0.035i?o down to O.Oli?©. This indicates that 
the data themselves do not allow us to choose between widths 
in that range. During the iterations as well as when we vary 
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Fig. 9. Variation of the inferred width (solid line) and the 
X 2 value for modes which have their turning points between 
O.6i?0 and O.8i?0 (dashed line) as a function of the iteration 
number in ARTUR process for 5 = lOOi?© / nHz). The values 
found at the first step are equal to those obtained from the fit of 
the T-GCV solution shown as horizontal lines in Figs. ||d and 
0c whereas the values obtained at the final step are shown by 
star graph markers on these plots. 

the value of 5, it is the amount of regularization introduced in 
high gradients zones that is changed. In that sense, stopping 
ARTUR iteration before its convergence according to the cri- 
terion Eq. ( |l7[ ) would be equivalent to increase the value of S. 
As the needed amount of regularization is related to the level 
of noise contained in the data, the reliability of our result is 
strongly related to our knowledge of the data noise. Since we 
have shown that the GCV criterion may reveal some discrep- 
ancies between the data noise and the simulated noise, the only 
way to gain more confidence on the appropriate choice of S, 
and thus on the result concerning the tachocline width inferred 
from this kind of non linear inversions, will be to increase our 
knowledge of the statistical properties of the data noise and to 
compare with other datasets. However, even the values of the 
width found at the final step for 6 — 20QRq/ nHz or at the 
sixth iteration with 5 = IOORq/ nHz are still below the value 
of O.O6i?0 inferred from T-GCV solution after a 'local decon- 
volution'. The use of nonlinear regularization argue in favor 
of a very sharp tachocline and even a discontinuity can not be 
excluded. Therefore our conclusion on the tachocline width is 
that it is very likely that it is less than O.O5i?0. This is reinforce 
by the simulations (circles in Fig. Q) showing that initial widths 
up to O.O5i?0 can lead to inferred widths of O.Oli?© whereas it 
is excluded (within la error bars) for initial widths larger than 
O.O5i? . 

7. Conclusions 

This work introduces in helioseismic context an approach of 
the inverse problem that use an adaptive regularization which 
is then used toward on the estimation of the width of the 
tachocline. We do not claim that this paper gives a definitive 



answer to this question. Instead, this paper presents a new tool 
which allows to accurately reconstruct rotation profiles with 
possibly both smooth and abrupt variations with depth. This 
approach leads to a non linear problem that can be solved eas- 
ily by an iterative process named ARTUR, initially developed 
in the field of image processing. It is shown that this allows to 
recover high gradients in the solution and avoids the spurious 
oscillations (known as 'Gibbs phenomena') that may be found 
when we try to recover such sharp transition zones with the 
usual Tikhonov approach. 

The proposed procedure for choosing the regularizing pa- 
rameters and the Monte-Carlo simulations represent a first step 
showing the feasibility and the capability of the method to re- 
trieve both small and large tachocline widths from noisy ob- 
servations. They have shown that this method, as well as the 
Tikhonov inversion with 'local deconvolution' is able to re- 
cover the width of erf functions with widths from to O.O3i?0 
up to O.O8i?0 with the same error estimation of ±0.02i? Q . Fol- 
lower values of the width, the results are more sensitive to the 
choice of the regularizing parameters. However, some improve- 
ments may be brought by the studies on the optimal choice of 
the regularizing parameters that are underway and their future 
application to the rotation inverse problem. 

The inversion of LOWL data using the ARTUR method 
gives an equatorial tachocline profile which differs from our 
previous work. These new results favor a sharp transition 
(down to a width of O.Oli?© with the adapted regularization 
parameters). From our study of the inferred width as a function 
of the parameter 6 and from our estimate of the uncertainty on 
this parameter, we conclude that the width of the tachocline 
should be less than 0.05i?Q. This estimate is somewhat differ- 
ent from our previous estimate of 0.05 ± 0.03i?o which allows 
relatively smooth transitions up to 0.08i?©. 

The change in our estimate of the width is partly due to 
the fact that we have changed the fitting function that defined 
the tachocline parameters. It is shown here that adding a lin- 
ear behaviour in the upper part of the erf function allows a 
better fit of the solution. Whereas the T-GCV solution (before 
any deconvolution) can be well approximated by a simple erf 
function between 0.4i? Q and O.8i?0, the ARTUR solution is 
better approximated, in the upper part of the tachocline, by a 
linear function with a slope around 7OnHz/i? that goes from 
452nHz at 0.7i? Q up to 459nHz at 0.8i? Q . It was not possi- 
ble to reach this conclusion from our previous approach using 
the T-GCV solution because the 'local deconvolution' used in 
this approach supposed explicitly that the rotation profile can 
be well approximated by a simple erf function (without lin- 
ear behaviour). It will be therefore very interesting to study in 
future works and with other datasets the rotation profile just 
above the tachocline, in order to become more confident of our 
result. 

An important contribution of the non linear regularization 
approach is that it allows to find directly a solution with sharp 
transitions without fixing a priori the shape of the transitions, 
as necessary in forward modeling or when we deconvolve a 
solution obtained from linear methods. In order to describe the 
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tachocline with only few parameters, we can afterwards choose 
a shape for the fitting function that is adapted to the solution 
found. 

Finally, we note that this approach to inversion with non 
linear regularization may find other applications in the helio- 
seismic context, for any the problems where high gradients are 
expected in the solutions. We have shown that this is the case 
for the rotation of the surface layers, which can be studied more 
accurately with instruments such as MDI on board SoHO, or 
the GONG network which observe high degree modes. This al- 
gorithm may also be applied to the sound speed anomaly found 
between the real Sun and solar models by structural inversions. 
The width of this peak, which is located in the tachocline, can 
be related to the width of the mixed zone which is supp osed t o 
exist just below the convection z one ( e.g. Morel et al. 1998 ). 
Some recent work (Elliott et al. 1998) have used a linear in- 



version which has been deconvolved to give an estimate of 
0.02i?Q for this width. It is interesting to note that this is of 
the same order as our estimate of the tachocline width deduced 
from the rotation profile. As the sound speed anomaly profile 
is also a zone with high gradients in the solution of an inverse 
problem, the use of non linear regularization may also be an 
alternate approach to address this problem in future work. 
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Appendix: terms of the discretization 

In our application, the polynomial expansion Eq. (^) is such 
that: 



(A.l) 



I Zr_\ tfr p _!<r<r v 

otherwise 

where (r p ) p= o t N p +i are the fixed break points distributed ac- 
cording to the density of turning points of modes. We have: 

= r = ?'i < r 2 < ..r Np -i < r Np = r Np+1 = i? Q , (A.2) 

therefore, each coefficient oj p of the expansion Eq. (^) simply 
represents the solution at the radius r p : 



V p = 1, ..N p tt(r p ) = Lu p . 



(A3) 



Furthermore, with this expansion of the solution, the first 
derivative of Cl(r) is represented by a piecewise constant func- 
tion. Therefore, in this trivial case, one can take in Eq. (|l0|): 



'p+ 



i-r p , p = l,..N p -l C = diag{c p ) (A.4) 



and the first derivative operator is defined by the bi-diagonal 
matrix: 



with: 



-d) 

J i,3 



-Si.-, + Si, 



1,N P - 

: l,N p 



(A.6) 



L = C~ l L^ 



(A.5) 
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